Retargeting cued speech hand gestures for different talking heads and speakers
Abstract
Cued Speech is a communication system that complements lip-reading with a small set of handshapes placed at different positions near the face. Developing a Cued Speech-capable system is a difficult and time-consuming task. This paper focuses on how an existing bank of reference Cued Speech gestures, exhibiting natural dynamics for hand articulation and movements, can be reused for another speaker (to augment video or 3D talking heads). Any Cued Speech hand gesture should be recorded, or considered, together with the concomitant facial locations that Cued Speech specifies to resolve lip-reading ambiguities (such as the lip corner, chin, cheek and throat for French). These facial target points move with head movements and with speech articulation. The post-processing algorithm proposed here retargets synthesized hand gestures to another face by slightly modifying the sequence of translations and rotations of the 3D hand. The algorithm preserves the coarticulation of the reference signal (including the undershooting of trajectories observed in fast Cued Speech) while adapting the gestures to the geometry, articulation and movements of the target face. We illustrate how our Cued Speech-capable audiovisual synthesizer, built from simultaneously recorded hand trajectories and facial articulation of a single French Cued Speech user, can serve as a reference signal for this retargeting algorithm. For the ongoing evaluation of the algorithm, an intelligibility paradigm has been retained, using natural videos for the face: the intelligibility of video VCV sequences with composited Cued Speech hand gestures is being measured with a panel of Cued Speech users.
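The abstract does not give the retargeting algorithm itself, so the following is only a minimal sketch of the general idea of re-anchoring a hand trajectory on another face. It assumes, hypothetically, that each frame provides the 3D position of the hand, of the cued facial point on the reference face, and of the same facial point on the target face. The function name `retarget_translations` and the `scale` parameter are illustrative and do not come from the paper. Rotations of the hand, which the paper's algorithm also adjusts, are left out.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def retarget_translations(
    ref_hand: Sequence[Vec3],
    ref_target: Sequence[Vec3],
    new_target: Sequence[Vec3],
    scale: float = 1.0,
) -> List[Vec3]:
    """Re-anchor a reference hand trajectory on another face, frame by frame.

    ref_hand[i]   : hand position in the reference recording
    ref_target[i] : cued facial point on the reference face
    new_target[i] : same facial point on the target face
    scale         : hypothetical face-size ratio (target / reference)
    """
    if not (len(ref_hand) == len(ref_target) == len(new_target)):
        raise ValueError("all trajectories must have the same number of frames")

    retargeted: List[Vec3] = []
    for hand, ref_pt, new_pt in zip(ref_hand, ref_target, new_target):
        # Offset of the hand from the facial point in the reference data.
        # A non-zero offset at the cue instant is the undershoot; it is kept.
        offset = tuple(h - r for h, r in zip(hand, ref_pt))
        # Apply the (optionally scaled) offset to the moving target point.
        retargeted.append(tuple(n + scale * o for n, o in zip(new_pt, offset)))
    return retargeted
```

Because the offset is taken relative to the facial point in every frame, the retargeted hand follows the head and articulation movements of the target face while keeping the relative dynamics of the reference gesture. A fuller treatment would also have to blend between successive facial targets and adapt hand orientation.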
Similar resources
Mental Timeline in Persian Speakers’ Co-speech Gestures based on Lakoff and Johnson’s Conceptual Metaphor Theory
One of the introduced conceptual metaphors is the metaphor of "time as space". Time as an abstract concept is conceptualized by a concrete concept like space. This conceptualization of time is also reflected in co-speech gestures. In this research, we try to find out what dimension and direction the mental timeline has in co-speech gestures and under the influence of which one of the metaphoric...
Audio-Visual Prosody: Perception, Detection, and Synthesis of Prominence
In this chapter, we investigate the effects of facial prominence cues, in terms of gestures, when synthesized on animated talking heads. In the first study a speech intelligibility experiment is conducted, where speech quality is acoustically degraded, then the speech is presented to 12 subjects through a lip synchronized talking head carrying head-nods and eyebrow raising gestures. The experim...
Ostensive signals: markers of communicative relevance of gesture during multimodal demonstrations to adults and children
Speakers adapt their speech and gestures in various ways for their audience. We investigated further whether they use ostensive signals (eye gaze, ostensive speech (e.g. like this, this) or a combination of both) in relation to their gestures when talking to different addressees, i.e., to another adult or a child in a multimodal demonstration task. While adults used more eye gaze towards their ...
A pilot study of temporal organization in Cued Speech production of French syllables: rules for a Cued Speech synthesizer
This study investigated the temporal coordination of the articulators involved in French Cued Speech. Cued Speech is a manual complement to lipreading. It uses handshapes and hand placements to disambiguate series of CV syllables. Hand movements, lip gestures and acoustic data were collected from a speaker certified in manual Cued Speech uttering and coding CV sequences. Experiment I studied ha...
Toward an audiovisual synthesizer for Cued Speech: Rules for CV French syllables
Manual Cued Speech is an effective method used to enhance speech perception for hearing-impaired people. Thanks to this system, a speaker can clarify what has been said with the help of hand gestures. Seeing manual cues associated to lip shapes allows the cue receiver to identify speech elements unambiguously. A large amount of work has been devoted to Cued Speech effectiveness in visual identi...